Reproducible Manuscripts in R

Princeton University

Jason Geller, Ph.D. (he/him)

2024-04-27

Introduction

The Problem

Word

Inside

Word issues

  • A .docx file is a compressed folder with lots of files
    • Your text is buried in with a lot of formatting information
  • Not reproducible
    • Code is divorced from writing
  • Difficult to maintain
    • Errors!
  • What do I share?
    • Lack of transparency

What do we want?

  • Combine narrative with code

  • Automatically generate figures and tables

  • Automatically render results in text

  • Format the content into a scientific paper (including citations!)

  • Something that looks pretty!

  • Rinse & repeat

Hello Quarto!

  • Quarto is the next generation of R Markdown

Big universe

  • R Markdown for EVERYONE

What is a Quarto?

Advantages

(1) Eliminate human error in copying and pasting results

We found that half of all published psychology papers that use null-hypothesis significance testing (NHST) contained at least one p-value that was inconsistent with its test statistic and degrees of freedom. One in eight papers contained a grossly inconsistent p-value that may have affected the statistical conclusion (Nuijten et al., 2016).
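
A minimal sketch of how this kind of inconsistency can be avoided, using the built-in `mtcars` data as a stand-in for a real dataset: the test is run once, and the statistic, degrees of freedom, and p-value in the report all come from the same fitted object. The variable name `report` and the formatting choices are illustrative assumptions.

```{r}
# Compare fuel economy of automatic (am = 0) and manual (am = 1) cars
fit <- t.test(mpg ~ am, data = mtcars)

# Build the in-text report from the test object instead of typing numbers
report <- sprintf(
  "t(%.2f) = %.2f, p %s",
  fit$parameter,   # Welch-adjusted degrees of freedom
  fit$statistic,   # t statistic
  if (fit$p.value < .001) "< .001" else sprintf("= %.3f", fit$p.value)
)
report
```

In a Quarto document, the string could then be placed in a sentence with an inline expression such as `` `r report` ``, so rerunning the analysis also updates the text.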

Advantages

(2) Easy revisions and specification of desired figures and tables

When revisions are requested, tables and figures often have to be tweaked by hand again and again. That creates a strong incentive never to rerun analyses, because doing so means re-pasting and redrawing every number and figure in the paper.

Advantages

(3) Promote computational reproducibility

  • Easy verification and replication of research findings

  • While programming environments may seem counter-intuitive for writing papers, they ultimately prevent mistakes and save time.

Let’s Get Started!

Getting started

  • Approach 1: Start from scratch (now)

    • Creating a Quarto manuscript

      • RStudio: New Project > New Directory > Quarto Manuscript
    • Note: Always start a new project folder!

  • Approach 2: Start with a sample template (later)

Getting started

  • Let’s go to RStudio!

Output

Anatomy of a Quarto Document

Metadata & Header (YAML)

---
format: html
---
  • Wait… what’s the YAML acronym?

    • Originally: “Yet Another Markup Language”

    • Later: “YAML Ain’t Markup Language”
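
For a manuscript, the header usually carries more than the output format. The following is a sketch with placeholder values: the title, author name, and `references.bib` file are hypothetical, while the keys themselves (`title`, `author`, `date`, `format`, `toc`, `bibliography`) are standard Quarto options.

```yaml
---
title: "My Manuscript"          # placeholder title
author: "First Last"            # placeholder author
date: today                     # Quarto fills in the render date
format:
  html:
    toc: true                   # add a table of contents
bibliography: references.bib    # hypothetical BibTeX file used for citations
---
```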

Code

```{r}
#| eval: true
library(dplyr)
mtcars %>% 
  group_by(cyl) %>%
  summarize(mean = mean(mpg), .groups = "drop")
```
# A tibble: 3 × 2
    cyl  mean
  <dbl> <dbl>
1     4  26.7
2     6  19.7
3     8  15.1
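
The `#|` lines at the top of a chunk are chunk options. As a sketch, the chunk below uses a few common ones to hide the code and attach a caption to a figure; the label `fig-mpg` and the caption text are made up for illustration.

```{r}
#| label: fig-mpg
#| echo: false
#| fig-cap: "Fuel economy by number of cylinders."
# Base R boxplot of the built-in mtcars data, one box per cylinder count
boxplot(mpg ~ cyl, data = mtcars,
        xlab = "Cylinders", ylab = "Miles per gallon")
```

Because the label starts with `fig-`, the figure can be cross-referenced in the text as `@fig-mpg`.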

Text

# Heading 1
This is a sentence with some **bold text**, 
some *italic text* and 
an ![image](image.png).